Improving the Performance of an Example-Based Machine Translation System Using a Domain-specific Bilingual Lexicon
نویسندگان
چکیده
In this paper, we study the impact of using a domain-specific bilingual lexicon on the performance of an Example-Based Machine Translation system. We conducted experiments for the EnglishFrench language pair on in-domain texts from Europarl (European Parliament Proceedings) and out-of-domain texts from Emea (European Medicines Agency Documents), and we compared the results of the Example-Based Machine Translation system against those of the Statistical Machine Translation system Moses. The obtained results revealed that adding a domain-specific bilingual lexicon (extracted from a parallel domain-specific corpus) to the general-purpose bilingual lexicon of the Example-Based Machine Translation system improves translation quality for both in-domain as well as outof-domain texts, and the Example-Based Machine Translation system outperforms Moses when texts to translate are related to the specific domain.
منابع مشابه
Building Multiword Expressions Bilingual Lexicons for Domain Adaptation of an Example-Based Machine Translation System
We describe in this paper a hybrid approach to build automatically bilingual lexicons of Multiword Expressions (MWEs) from parallel corpora. We more specifically investigate the impact of using a domain-specific bilingual lexicon of MWEs on domain adaptation of an Example-Based Machine Translation (EBMT) system. We conducted experiments on the English-French language pair and two kinds of texts...
متن کاملEvaluating the Impact of Using a Domain-specific Bilingual Lexicon on the Performance of a Hybrid Machine Translation Approach
This paper describes an Example-Based Machine Translation prototype and presents an evaluation of the impact of using a domainspecific vocabulary on its performance. This prototype is based on a hybrid approach which needs only monolingual texts in the target language and consists to combine translation candidates returned by a cross-language search engine with translation hypotheses provided b...
متن کاملMachine Translation for Multilingual Troubleshooting in the IT Domain: A Comparison of Different Strategies
In this paper, we address the problem of machine translation (MT) of domain-specific texts for which large amounts of parallel data for training are not available. We focus on the IT domain and on English to Portuguese machine translation, and compare different strategies for improving system performance over two baselines, the first using only large dataset of out-of-domain data, and the secon...
متن کاملSemi-Automatic Acquisition of Domain-Specific Translation Lexicons
We investigate the utility of an algorithm for translation lexicon acquisition (SABLE), used previously on a very large corpus to acquire general translation lexicons, when that algorithm is applied to a much smaller corpus to produce candidates for domain-specific translation lexicons. 1 I n t r o d u c t i o n Reliable translation lexicons are useful in many applications, such as cross-langua...
متن کاملImproving Statistical Machine Translation Using Domain Bilingual Multiword Expressions
Multiword expressions (MWEs) have been proved useful for many natural language processing tasks. However, how to use them to improve performance of statistical machine translation (SMT) is not well studied. This paper presents a simple yet effective strategy to extract domain bilingual multiword expressions. In addition, we implement three methods to integrate bilingual MWEs to Moses, the state...
متن کامل